Prediction of Protein Subcellular Multi-localization by Using a Min-Max Modular Support Vector Machine
Abstract
Predicting protein subcellular location is an important problem in computational biology because it provides important clues for characterizing protein function. Much effort has been dedicated to developing automatic prediction tools, but most of them focus on mono-locational proteins. Many proteins, however, are multi-locational, and they carry out crucial functions in biological processes. This work aims to develop a general pattern classifier for predicting multiple subcellular locations of proteins. We used an ensemble classifier, the min-max modular support vector machine (M-SVM), to solve the protein subcellular multi-localization problem, and proposed a task decomposition method for the M-SVM based on gene ontology (GO) semantic information. We applied our method to a high-quality multi-locational protein data set. The M-SVMs performed better than traditional SVMs using the same feature vectors. The GO decomposition also improved prediction accuracy and gave more stable performance than random decomposition.
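The min-max modular idea mentioned in the abstract can be illustrated with a minimal sketch for a single two-class problem. It rests on several assumptions not taken from the abstract: the function names are hypothetical, the decomposition is the random baseline rather than the GO-based one, and each module is a simple centroid-based linear discriminant standing in for an SVM so that the example needs no external libraries. The modules that share a positive subset are combined with MIN, and the results are combined with MAX.

```python
import random

def split_random(samples, n_parts, seed=0):
    """Randomly partition samples into n_parts roughly equal subsets."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    return [shuffled[i::n_parts] for i in range(n_parts)]

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]

def train_module(pos, neg):
    """Stand-in base learner (NOT an SVM): a linear discriminant whose
    boundary is the perpendicular bisector of the two class centroids."""
    cp, cn = centroid(pos), centroid(neg)
    w = [a - b for a, b in zip(cp, cn)]
    mid = [(a + b) / 2 for a, b in zip(cp, cn)]
    bias = -sum(wi * mi for wi, mi in zip(w, mid))
    return lambda x: sum(wi * xi for wi, xi in zip(w, x)) + bias

def train_min_max(pos, neg, n_pos_parts, n_neg_parts, seed=0):
    """Train one module per (positive subset, negative subset) pair."""
    pos_parts = split_random(pos, n_pos_parts, seed)
    neg_parts = split_random(neg, n_neg_parts, seed + 1)
    return [[train_module(p, n) for n in neg_parts] for p in pos_parts]

def decision(modules, x):
    """MIN over modules sharing a positive subset, then MAX over the rows.
    A positive value means x is assigned to the positive class."""
    return max(min(m(x) for m in row) for row in modules)

# Toy data: a small positive class and a larger negative class.
pos = [(2, 2), (3, 2), (2, 3), (3, 3)]
neg = [(-2, -2), (-3, -2), (-2, -3), (-3, -3), (-2, -4), (-4, -2)]
modules = train_min_max(pos, neg, n_pos_parts=2, n_neg_parts=3)
print(decision(modules, (3, 3)) > 0, decision(modules, (-3, -3)) > 0)
```

For multi-label prediction, one such classifier per subcellular location could be trained in a one-versus-rest fashion, with a protein assigned to every location whose decision value is positive; this too is an assumption rather than a detail given in the abstract.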
Similar resources
Prediction of Protein Subcellular Multi-locations with a Min-Max Modular Support Vector Machine
How to predict subcellular multi-locations of proteins with machine learning techniques is a challenging problem in the computational biology community. Regarding the protein multi-location problem as a multi-label pattern classification problem, we propose a new prediction method for the protein subcellular localization problem in this paper. Two key points of the proposed method are ...
Learning from imbalanced data sets with a Min-Max modular support vector machine
Imbalanced data sets have significantly unequal distributions between classes. This between-class imbalance causes conventional classification methods to favor majority classes, resulting in very low or even no detection of minority classes. A Min-Max modular support vector machine (M-SVM) approaches this problem by decomposing the training input sets of the majority classes into subsets of sim...
A comparison of computational strategies for multi-label prediction of protein subcellular localizations
The subcellular localization of a protein is closely correlated with its functions. Although many machine learning algorithms have been developed and applied to predict protein compartments using different data sources, such as protein amino acid sequence and motif information, automatic prediction of subcellular localization remains a challenging problem. In this study, we compared three suppo...
Protein Subcellular Localization Prediction for Fusarium graminearum
The fungal pathogen Fusarium graminearum (teleomorph Gibberella zeae) is the causal agent of several destructive crop diseases. Investigating subcellular localizations of F. graminearum proteins can provide insight into pathogenic mechanisms underlying F. graminearum-host interactions. In this paper, we design a novel balanced ensemble classifier based on support vector machines (SVMs) to predic...
Multi-View Face Recognition with Min-Max Modular Support Vector Machines
As a result of statistical learning theory, support vector machines (SVMs)[23] are effective classifiers for the classification problems. SVMs have been successfully applied to various pattern classification problems, such as handwritten digit recognition, text categorization and face detection, due to their powerful learning ability and good generalization ability. However, SVMs require to sol...
Publication date: 2009